synthetic data

Terms from Artificial Intelligence: humans at the heart of algorithms

Page numbers are for the draft copy at present; they will be replaced with the correct numbers when the final book is formatted. Chapter numbers are correct and will not change.

Synthetic data is data generated by models or by manipulating real data. It can be useful for training deep neural networks when there is insufficient real data or real data is hard to obtain. For example, 3D simulations can be used to emulate near-accident situations, such as pedestrians crossing busy roads or in the dark; these will hopefully be rare in real gathered video data, but are invaluable in helping autonomous cars respond well to extreme events. Synthetic data can also help generalisation, for example by adding noise to images or adding cropped, rotated or resized copies. Synthetic data has also been suggested as a way to reduce bias in data sets.
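
As a rough illustration of the image manipulations mentioned above, the following minimal sketch creates noisy, cropped, rotated and resized copies of a small greyscale image. It assumes an image is held as a list of rows of values between 0 and 1; the function names (add_noise, crop, rotate90, downscale, augment) and the parameter values are hypothetical and chosen only for this example. In practice an image library would normally be used instead.

```python
import random


def add_noise(image, rng, std=0.05):
    """Add Gaussian noise to every pixel, keeping values within [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, std))) for p in row]
            for row in image]


def crop(image, margin):
    """Remove `margin` pixels from each edge of the image."""
    return [row[margin:len(row) - margin]
            for row in image[margin:len(image) - margin]]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def downscale(image, factor=2):
    """Naive resize: keep every `factor`-th pixel in each direction."""
    return [row[::factor] for row in image[::factor]]


def augment(image, rng):
    """Return synthetic variants derived from one real image."""
    return [
        add_noise(image, rng),
        crop(image, 1),
        rotate90(image),
        downscale(image),
    ]


# A random 6x6 array stands in for a real image (illustrative data only).
rng = random.Random(0)
image = [[rng.random() for _ in range(6)] for _ in range(6)]
synthetic = augment(image, rng)
```

Each real image here yields four extra training examples. The cropped and resized copies are smaller than the original, so a real training pipeline would usually rescale them to a common size before use.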

Defined on page 437

Used on Chap. 8: page 156; Chap. 18: page 437; Chap. 24: page 582